
19 research outputs found

Improving cache behavior in CMP architectures through cache partitioning techniques

Author: Moretó Planas Miquel
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2010

Microprocessor design has changed significantly over the last few decades, moving from simple in-order single-core architectures to superscalar and vector architectures in order to extract the maximum available instruction-level parallelism. Executing several instructions from the same thread in parallel can significantly improve the performance of an application. However, only a limited amount of parallelism is available in each thread because of data and control dependences. Furthermore, designing a high-performance, single, monolithic processor has become very complex due to power and chip latency constraints. These limitations have motivated the use of thread-level parallelism (TLP) as a common strategy for improving processor performance. Multithreaded processors execute different threads at the same time, sharing some hardware resources. Several flavors of multithreaded processors exploit TLP, such as chip multiprocessors (CMP), coarse-grain multithreading, fine-grain multithreading, simultaneous multithreading (SMT), and combinations of them.
To improve cost and power efficiency, the computer industry has adopted multicore chips. In particular, CMP architectures have become the most common design decision (sometimes combined with multithreaded cores). Firstly, CMPs reduce design costs and average power consumption by promoting design re-use and simpler processor cores. For example, it is less complex to design a chip with many small, simple cores than a chip with fewer, larger, monolithic cores. Furthermore, simpler cores have fewer power-hungry centralized hardware structures. Secondly, CMPs reduce costs by improving hardware resource utilization. On a multicore chip, co-scheduled threads can share costly microarchitectural resources that would otherwise be underutilized. Higher resource utilization improves aggregate performance and enables lower-cost design alternatives.
One of the resources with the greatest impact on the final performance of an application is the cache hierarchy. Caches store data recently used by applications in order to exploit their temporal and spatial locality. Caches provide fast access to data, improving application performance. Caches with low latencies have to be small, which prompts the design of a cache hierarchy organized into several levels.
In CMPs, the cache hierarchy is normally organized with a first level (L1) of instruction and data caches private to each core. A last level of cache (LLC) is normally shared among the cores of the processor (L2, L3 or both). Shared caches increase resource utilization and system performance. Large caches improve performance and efficiency by increasing the probability that each application can access data from a closer level of the cache hierarchy. A shared cache also allows an application to use the entire cache if needed.
A second advantage of a shared cache in a CMP design concerns cache coherency. In parallel applications, different threads share the same data and keep a local copy of it in their caches. With multiple processors, one processor can change the data, leaving another processor's cache with outdated data. A cache coherency protocol monitors changes to data and ensures that all processor caches have the most recent data. When a parallel application executes on a single physical chip, the cache coherency circuitry can operate at the speed of on-chip communication, rather than relying on the much slower between-chip communication required with discrete processors on separate chips. These coherence protocols are simpler to design with a unified, shared level of cache on-chip.
Due to the advantages that multicore architectures offer, chip vendors use CMP architectures in current high-performance, network, real-time and embedded systems. Several of these commercial processors have a level of the cache hierarchy shared by different cores. For example, the Sun UltraSPARC T2 has a 16-way 4MB L2 cache shared by 8 cores, each of them up to 8-way SMT. Other processors, like the Intel Core 2 family, also share an L2 cache of up to 12MB and 24 ways. In contrast, the AMD K10 family has a private L2 cache per core and a shared L3 cache of up to 6MB and 64 ways.
As the long-term trend of increasing integration continues, the number of cores per chip is also projected to increase with each successive technology generation. Some significant studies have shown that processors with hundreds of cores per chip will appear on the market in the coming years. The manycore era has already begun. Although this era provides many opportunities, it also presents many challenges. In particular, higher hardware resource sharing among concurrently executing threads can make an individual thread's performance unpredictable and might lead to violations of individual applications' performance requirements. Current resource management mechanisms and policies are no longer adequate for future multicore systems.
Some applications, such as multimedia, communications or streaming applications, show low re-use of their data and pollute caches with data streams, or have many compulsory misses that cannot be avoided by assigning more cache space to the application. Traditional eviction policies such as Least Recently Used (LRU), pseudo-LRU or random are demand-driven; that is, they tend to give more space to the application that makes more accesses to the cache hierarchy.
When no direct control is exercised over shared resources (the last-level cache in this case), a particular thread may allocate most of the shared resources, degrading other threads' performance. As a consequence, high resource sharing and resource utilization can cause systems to become unstable and violate individual applications' requirements. To provide Quality of Service (QoS) to applications, we need to enhance the control over shared resources and enrich the collaboration between the OS and the architecture.
In this thesis, we propose software and hardware mechanisms to improve cache sharing in CMP architectures. We take a holistic approach, coordinating software and hardware targets to improve aggregate system performance and provide QoS to applications. We use explicit resource allocation techniques to control the shared cache in a CMP architecture, with resource allocation targets driven by hardware and software mechanisms.
The main contributions of this thesis are the following:
- We have characterized different single- and multithreaded applications and classified workloads with a systematic method to better understand and explain the effects of cache sharing on a CMP architecture. We have made a special effort to study previous cache partitioning techniques for CMP architectures, in order to acquire the insight needed to propose improved mechanisms.
- In CMP architectures with out-of-order processors, cache misses can be served in parallel and share the miss penalty of accessing main memory. We take this fact into account to propose new cache partitioning algorithms guided by the memory-level parallelism (MLP) of each application. With these algorithms, system performance is improved (in terms of throughput and fairness) without significantly increasing the hardware required by previous proposals.
- Driving cache partition decisions with indirect indicators of performance such as misses, MLP or data re-use may lead to suboptimal cache partitions. Ideally, the metric that drives cache partitions should be the target metric to optimize, which is normally related to IPC. Thus, we have developed a hardware mechanism, OPACU, which obtains accurate run-time predictions of the performance of an application when running with different cache assignments.
- Using performance predictions, we have introduced a new framework to manage shared caches in CMP architectures, FlexDCP, which allows the OS to optimize different IPC-related target metrics such as throughput or fairness and to provide QoS to applications. FlexDCP allows enhanced coordination between the hardware and software layers, which leads to improved system performance and flexibility.
- Next, we have used performance estimations to reduce the load imbalance problem in parallel applications. We have built a run-time mechanism that detects parallel applications sensitive to cache allocation and, in these situations, reduces the load imbalance by assigning more cache space to the slowest threads. This mechanism helps reduce the long optimization time, measured in man-years of effort, devoted to large-scale parallel applications.
- Finally, we have stated the main characteristics that future multicore processors with thousands of cores should have. An enhanced coordination between the software and hardware layers has been proposed to better manage the shared resources in these architectures.
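
The explicit allocation idea in the abstract above can be illustrated with a minimal sketch. It is not the thesis's algorithm (it ignores MLP and IPC predictions): it assumes a generic greedy scheme that hands out cache ways one at a time to whichever application's miss count drops most. The function name `partition_ways` and the miss curves are hypothetical, made-up values for illustration only.

```python
def partition_ways(miss_curves, total_ways):
    """Greedily split `total_ways` cache ways among applications.

    miss_curves[i][w] is the (assumed, pre-measured) number of misses of
    application i when it owns w ways. Each application starts with one way;
    every remaining way goes to the application with the largest miss reduction.
    """
    n = len(miss_curves)
    alloc = [1] * n  # every application keeps at least one way
    for _ in range(total_ways - n):
        # Miss reduction each application would get from one extra way.
        gains = [
            curve[alloc[i]] - curve[alloc[i] + 1] if alloc[i] + 1 < len(curve) else 0
            for i, curve in enumerate(miss_curves)
        ]
        winner = max(range(n), key=lambda i: gains[i])
        alloc[winner] += 1
    return alloc


# Illustrative numbers only: a streaming application whose misses never
# improve with more space, and a cache-friendly application that benefits.
streaming = [50] * 9
friendly = [100, 80, 60, 40, 25, 15, 8, 4, 2]
print(partition_ways([streaming, friendly], 8))  # → [1, 7]
```

Unlike a demand-driven policy such as LRU, this sketch confines the streaming application to a single way however many accesses it issues. The greedy choice is only guaranteed to minimize total misses when every curve shows diminishing returns.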

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

Secretaría de Estado de Cultura

Absorció d'hexoses a través de l'epiteli intestinal [Absorption of hexoses across the intestinal epithelium]

Author: Ferrer Ruth
Moretó Miquel
Planas Joana Maria
Publication venue: Institut d'Estudis Catalans
Publication date: 01/01/1995

Revistes Catalanes amb Accés Obert

Absorció intestinal de monosacàrids [Intestinal absorption of monosaccharides]

Author: Ferrer Ruth
Moretó Miquel
Planas Joana Maria
Publication venue: Institut d'Estudis Catalans
Publication date: 01/01/1989

Revistes Catalanes amb Accés Obert

Fisiologia dels cecs de pollastre [Physiology of the chicken ceca]

Author: Ferrer Ruth
Moretó Miquel
Planas Joana Maria
Publication venue: Institut d'Estudis Catalans
Publication date: 01/01/1985

The morphology and physiology of the chicken cecum are reviewed. The large intestine of the chicken is formed by the rectum and the cloaca, and two well-developed ceca, which are two blind sacs, tubular in shape, that originate at the junction of the small intestine and the rectum. Light microscopy of the epithelium shows that the proximal region has well-developed villi, in contrast to the distal cecum, where they are either small or absent. The information available so far on the physiological mechanisms underlying the filling and emptying of the cecum has been reviewed, and it is particularly worth noting that cecal contents may have both ileal and rectal origins. Several functions have been suggested for the cecum of the chicken, but much remains to be discovered about its real physiological significance. There is evidence that proteins and complex carbohydrates can be partially digested in the ceca. It is also claimed that the cecum is the site of production of significant amounts of free volatile fatty acids and vitamins, among other compounds. Several authors have observed absorption of electrolytes and water in the cecum. Urine can enter the ceca by a retrograde flux, which suggests that the cecal epithelium plays a role in osmoregulation in fowl, a function which may be of special relevance in dehydrated animals. Recent reports on the sugar-transport characteristics of the chicken cecum indicate that the epithelium of the proximal area possesses an active transport system as efficient as that described in the small intestine. This raises the possibility that ceca are significant in chicken nutrition, since sugar uptake can occur during both cecal filling and emptying. The physiological role of chicken ceca is not well understood. However, it has been shown that ceca are not essential for animal survival, at least in environmental conditions allowing normal feeding and hydration.

Revistes Catalanes amb Accés Obert

Absorció intestinal de monosacàrids. [Intestinal absorption of monosaccharides.]

Author: Ferrer i Roig Ruth
Moretó Miquel, 1950-
Planas i Rosselló Joana M.
Publication venue: Societat Catalana de Biologia
Publication date: 16/01/2019

The intestine is a highly specialized organ whose primary function is to take up the nutrients that make up the diet and transfer them to the bloodstream, from where they are distributed throughout the body. This work reviews current knowledge of the mechanisms involved in the intestinal absorption of monosaccharides. The process consists of the passage of monosaccharides across the luminal (brush-border) membrane by concentrative mediated transport (active transport) and by simple diffusion. Once inside the enterocyte, the monosaccharides leave the cell across the basolateral membrane, either by equilibrative mediated transport (facilitated diffusion) or by simple diffusion. They thus reach the lamina propria, from where they pass into the plasma compartment. The process that takes place at the luminal membrane consists of the cotransport of the substrate and the sodium ion, and it is sustained by the Na+ electrochemical gradient and by the membrane potential. It is a mechanism that depends on metabolic energy and on the presence of Na+ in the extracellular medium.

Diposit Digital de la Universitat de Barcelona

Aplicació de la metodologia POGIL a l'aprenentatge de la fisiologia [Application of the POGIL methodology to the learning of physiology]

Author: Amat Concepció
Juan i Olivé M. Emília
Moretó Miquel, 1950-
Planas i Rosselló Joana M.
Pérez Bosque Anna
Publication venue: Edicions de la Universitat de Barcelona
Publication date: 01/02/2015

The complete proceedings of the Vuitena trobada de professorat de Ciències de la Salut (Eighth Meeting of Health Sciences Teaching Staff) are available at: http://hdl.handle.net/2445/66524 POGIL is the acronym for "Process Oriented Guided Inquiry Learning". It is a methodology that fits with constructivism: the student builds knowledge from questions and tasks set by the teacher. A POGIL activity consists of a classroom session in which students work through a questionnaire with various activities, so that in the end they themselves develop the topic.

Diposit Digital de la Universitat de Barcelona

Absorció d'hexoses a través de l'epiteli intestinal. [Absorption of hexoses across the intestinal epithelium.]

Author: Amat Concepció
Ferrer i Roig Ruth
Moretó Miquel, 1950-
Planas i Rosselló Joana M.
Publication venue: Societat Catalana de Biologia
Publication date: 16/01/2019

This review compiles current knowledge of the mechanisms involved in the intestinal absorption of monosaccharides. In addition to nonspecific mechanisms, characterized by passive movement down the gradient through the transcellular and paracellular pathways, monosaccharides are transported by specific proteins of the enterocyte membrane. These transporters are proteins with twelve transmembrane segments and are grouped into two large families: the SGLTs, characterized by cotransporting Na+ and hexose and by being able to accumulate the substrate against a gradient, and the GLUTs, which are facilitative transporters that operate only down a gradient. During absorption, apical SGLT1 transports D-glucose and D-galactose with high affinity, whereas D-fructose enters through GLUT5. The basolateral membrane contains a low-affinity system, called GLUT2, which transports all three monosaccharides towards the internal medium. The regulation of monosaccharide transport depends on genetic factors and on the control exerted by the transported substrates themselves. Because these are non-essential nutrients with a purely energetic function, the adaptive pattern consists of the induction of transport by the substrate present in the intestinal lumen. The increase in transport capacity correlates with the expression or density of SGLT1, GLUT5 and GLUT2. At high luminal concentrations of monosaccharides, the regulation of the permeability of the paracellular pathway may account for a considerable fraction of the transepithelial flux of this type of nutrient.

Diposit Digital de la Universitat de Barcelona